Smart Paper Notes

IdeaReader: A Machine Reading System for Understanding the Idea Flow of Scientific Publications

Qi Li , Yuyang Ren , Xingli Wang , Luoyi Fu , Jiaxin Ding , Xinde Cao , Xinbing Wang , Chenghu Zhou

Category: Natural Language Processing

2022-09-27

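Understanding the origin and influence of a publication's ideas is critical to conducting scientific research. However, the proliferation of scientific publications makes it difficult for researchers to sort out the evolution of all the relevant literature. To this end, we present IdeaReader, a machine reading system that finds out which papers are most likely to inspire, or be influenced by, a target publication and summarizes the ideas of these papers in natural language. Specifically, IdeaReader first clusters the references and citations (first-order or higher-order) of the target publication, and the resulting clusters are regarded as the topics that inspire or are influenced by the target publication. It then picks out the important papers from each cluster to extract the skeleton of the idea flow. Finally, IdeaReader automatically generates a literature review of the important papers in each topic. Through the automatically generated survey and the visualization of the idea flow, our system can help researchers gain insight into how scientific ideas flow from the target publication's references to its citations.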

Translated by Google Translate

Class Is Invariant to Context and Vice Versa: On Learning Invariance for Out-Of-Distribution Generalization

Jiaxin Qi , Kaihua Tang , Qianru Sun , Xian-Sheng Hua , Hanwang Zhang

Category: Computer Vision

2022-08-06

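Out-of-distribution (OOD) generalization is all about learning invariance against environmental changes. If the context in every class were evenly distributed, OOD would be trivial, because the context could be easily removed thanks to an underlying principle: class is invariant to context. However, collecting such a balanced dataset is impractical. Learning on imbalanced data biases the model toward context and thus hurts OOD. Therefore, the key to OOD is context balance. We argue that the assumption widely adopted in prior work, that the context bias can be directly annotated or estimated from biased class predictions, renders the context incomplete or even incorrect. In contrast, we point out the other side of the above principle: context is also invariant to class, which motivates us to treat the classes (which are already labeled) as the varying environments in order to resolve context bias (without context labels). We implement this idea by minimizing the contrastive loss of intra-class sample similarity while ensuring that this similarity is invariant across all classes. On benchmarks with various context biases and domain gaps, we show that a simple re-weighting based classifier equipped with our context estimation achieves state-of-the-art performance. We provide the theoretical justifications in the appendix and the code at https://github.com/simpleshinobu/irmcon.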
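The idea of re-weighting toward context balance can be illustrated with a minimal sketch. This is generic inverse-frequency weighting, not the authors' method: it assumes a context id is already available for every sample (the paper estimates contexts without context labels), and the function name `context_balancing_weights` and the toy data are hypothetical.

```python
from collections import Counter


def context_balancing_weights(labels, contexts):
    """Return one weight per sample so that, within each class,
    every context group contributes the same total weight.

    Assumption: `contexts` holds a (hypothetical) context id per sample.
    """
    pair_counts = Counter(zip(labels, contexts))              # samples per (class, context)
    class_counts = Counter(labels)                            # samples per class
    contexts_per_class = Counter(c for c, _ in pair_counts)   # distinct contexts per class
    return [
        class_counts[c] / (contexts_per_class[c] * pair_counts[(c, k)])
        for c, k in zip(labels, contexts)
    ]


# Toy data: one class whose samples are dominated by a single context.
labels = ["cow", "cow", "cow", "cow"]
contexts = ["grass", "grass", "grass", "beach"]
weights = context_balancing_weights(labels, contexts)
```

In this toy case the single "beach" sample receives a larger weight than each "grass" sample, so both contexts carry equal total weight within the class while the weights still sum to the class size.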


Invariant Feature Learning for Generalized Long-Tailed Classification

Kaihua Tang , Mingyuan Tao , Jiaxin Qi , Zhenguang Liu , Hanwang Zhang

Category: Computer Vision

2022-07-19

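Existing long-tailed classification (LT) methods focus only on tackling the class-wise imbalance, in which head classes have more samples than tail classes, but overlook the attribute-wise imbalance. In fact, even if the classes are balanced, the samples within each class may still be long-tailed due to their varying attributes. Note that the latter is fundamentally more ubiquitous and challenging than the former, because attributes are not only implicit in most datasets but also combinatorially complex, and thus prohibitively expensive to balance. Therefore, we introduce a novel research problem: Generalized Long-Tailed classification (GLT), which jointly considers both kinds of imbalance. By "generalized", we mean that a GLT method should naturally solve traditional LT, but not vice versa. Not surprisingly, we find that most class-wise LT methods degenerate on our two proposed benchmarks: ImageNet-GLT and MSCOCO-GLT. We argue that this is because they over-emphasize the adjustment of the class distribution while neglecting to learn attribute-invariant features. To this end, we propose an Invariant Feature Learning (IFL) method as the first strong baseline for GLT. IFL first discovers environments with divergent intra-class distributions from the imperfect predictions and then learns invariant features across them. Promisingly, as an improved feature backbone, IFL boosts the entire LT line-up: one/two-stage re-balancing, augmentation, and ensemble. Code and benchmarks are available on GitHub: https://github.com/kaihuatang/generalized-long-tailed-benchmarks.pytorch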


GraphQ IR: Unifying the Semantic Parsing of Graph Query Languages with One Intermediate Representation

Lunyiu Nie , Shulin Cao , Jiaxin Shi , Jiuding Sun , Qi Tian , Lei Hou , Juanzi Li , Jidong Zhai

Category: Natural Language Processing

2022-05-24

Subject to the huge semantic gap between natural and formal languages, neural semantic parsing is typically bottlenecked by its complexity of dealing with both input semantics and output syntax. Recent works have proposed several forms of supplementary supervision but none is generalized across multiple formal languages. This paper proposes a unified intermediate representation (IR) for graph query languages, named GraphQ IR. It has a natural-language-like expression that bridges the semantic gap and formally defined syntax that maintains the graph structure. Therefore, a neural semantic parser can more precisely convert user queries into GraphQ IR, which can be later losslessly compiled into various downstream graph query languages. Extensive experiments on several benchmarks including KQA Pro, Overnight, GrailQA, and MetaQA-Cypher under standard i.i.d., out-of-distribution, and low-resource settings validate GraphQ IR's superiority over the previous state-of-the-arts with a maximum 11% accuracy improvement.


Deconfounded Visual Grounding

Jianqiang Huang , Yu Qin , Jiaxin Qi , Qianru Sun , Hanwang Zhang

Category: Computer Vision | Natural Language Processing

2021-12-31

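We focus on the confounding bias between language and location in the visual grounding pipeline, where we find that the bias is the major visual reasoning bottleneck. For example, the grounding process is often a trivial language-location association without visual reasoning, e.g., grounding any language query containing "sheep" to the near-central regions, because the ground-truth locations for queries about sheep mostly lie at the image center. First, we frame the visual grounding pipeline as a causal graph, which shows the causal relations among the image, the query, the target location, and the underlying confounder. Through the causal graph, we know how to break the grounding bottleneck: deconfounded visual grounding. Second, to tackle the challenge that the confounder is unobserved in general, we propose a confounder-agnostic approach called the Referring Expression Deconfounder (RED) to remove the confounding bias. Third, we implement RED as a simple language attention, which can be applied to any grounding method. On popular benchmarks, RED improves various state-of-the-art grounding methods by a significant margin. Code will be available soon at: https://github.com/jianqiangh/deconfounded_vg.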


Shape from Polarization for Complex Scenes in the Wild

Chenyang Lei , Chenyang Qi , Jiaxin Xie , Na Fan , Vladlen Koltun , Qifeng Chen

Category: Computer Vision

2021-12-21

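We present a new data-driven approach with physics-based priors for scene-level normal estimation from a single polarization image. Existing shape from polarization (SfP) work mainly focuses on estimating the normals of a single object rather than of complex scenes in the wild. A key barrier to high-quality scene-level SfP is the lack of real-world SfP data in complex scenes. Hence, we contribute the first real-world scene-level SfP dataset, with paired input polarization images and ground-truth normal maps. We then propose a learning-based framework with a multi-head self-attention module and viewing encoding, which is designed to handle the increased polarization ambiguities caused by complex materials and non-orthographic projection in scene-level SfP. Because the relationship between polarized light and surface normals is not affected by distance, our trained model generalizes to far-field outdoor scenes. Experimental results demonstrate that our approach significantly outperforms existing SfP models on two datasets. Our dataset and source code will be publicly available at https://github.com/chenyanglei/sfp-wild.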


POTATO: The Portable Text Annotation Tool

Jiaxin Pei , Aparna Ananthasubramaniam , Xingyao Wang , Naitian Zhou , Jackson Sargent , Apostolos Dedeloudis , David Jurgens

Category: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-12-16

We present POTATO, the Portable text annotation tool, a free, fully open-sourced annotation system that 1) supports labeling many types of text and multimodal data; 2) offers easy-to-configure features to maximize the productivity of both deployers and annotators (convenient templates for common ML/NLP tasks, active learning, keypress shortcuts, keyword highlights, tooltips); and 3) supports a high degree of customization (editable UI, inserting pre-screening questions, attention and qualification tests). Experiments over two annotation tasks suggest that POTATO improves labeling speed through its specially-designed productivity features, especially for long documents and complex tasks. POTATO is available at https://github.com/davidjurgens/potato and will continue to be updated.


MOPRD: A multidisciplinary open peer review dataset

Jialiang Lin , Jiaxin Song , Zhangping Zhou , Yidong Chen , Xiaodong Shi

Category: Artificial Intelligence | Natural Language Processing | Machine Learning

2022-12-09

Open peer review is a growing trend in academic publications. Public access to peer review data can benefit both the academic and publishing communities. It also serves as a great support to studies on review comment generation and further to the realization of automated scholarly paper review. However, most of the existing peer review datasets do not provide data that cover the whole peer review process. Apart from this, their data are not diversified enough as they are mainly collected from the field of computer science. These two drawbacks of the currently available peer review datasets need to be addressed to unlock more opportunities for related studies. In response to this problem, we construct MOPRD, a multidisciplinary open peer review dataset. This dataset consists of paper metadata, multiple version manuscripts, review comments, meta-reviews, author's rebuttal letters, and editorial decisions. Moreover, we design a modular guided review comment generation method based on MOPRD. Experiments show that our method delivers better performance indicated by both automatic metrics and human evaluation. We also explore other potential applications of MOPRD, including meta-review generation, editorial decision prediction, author rebuttal generation, and scientometric analysis. MOPRD is a strong endorsement for further studies in peer review-related research and other applications.


Towards Accurate Ground Plane Normal Estimation from Ego-Motion

Jiaxin Zhang , Wei Sui , Qian Zhang , Tao Chen , Cong Yang

Category: Computer Vision | Robotics

2022-12-08

In this paper, we introduce a novel approach for ground plane normal estimation of wheeled vehicles. In practice, the ground plane is dynamically changed due to braking and unstable road surface. As a result, the vehicle pose, especially the pitch angle, is oscillating from subtle to obvious. Thus, estimating ground plane normal is meaningful since it can be encoded to improve the robustness of various autonomous driving tasks (e.g., 3D object detection, road surface reconstruction, and trajectory planning). Our proposed method only uses odometry as input and estimates accurate ground plane normal vectors in real time. Particularly, it fully utilizes the underlying connection between the ego pose odometry (ego-motion) and its nearby ground plane. Built on that, an Invariant Extended Kalman Filter (IEKF) is designed to estimate the normal vector in the sensor's coordinate. Thus, our proposed method is simple yet efficient and supports both camera- and inertial-based odometry algorithms. Its usability and the marked improvement of robustness are validated through multiple experiments on public datasets. For instance, we achieve state-of-the-art accuracy on KITTI dataset with the estimated vector error of 0.39°. Our code is available at github.com/manymuch/ground_normal_filter.
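To make the reported vector error concrete, the sketch below computes the angle between an estimated and a reference normal vector. It assumes the error is measured as this angle, which the abstract does not spell out. The helper `pitched_normal`, the axis convention (y up, pitch about the x-axis), and the function names are hypothetical and are not taken from the authors' code.

```python
import math


def angular_error_deg(n_est, n_ref):
    """Angle in degrees between two 3D vectors (they need not be unit length)."""
    dot = sum(a * b for a, b in zip(n_est, n_ref))
    norm_est = math.sqrt(sum(a * a for a in n_est))
    norm_ref = math.sqrt(sum(b * b for b in n_ref))
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    cos_angle = max(-1.0, min(1.0, dot / (norm_est * norm_ref)))
    return math.degrees(math.acos(cos_angle))


def pitched_normal(pitch_deg):
    """Hypothetical helper: the up vector (0, 1, 0) rotated by a pitch angle about the x-axis."""
    p = math.radians(pitch_deg)
    return (0.0, math.cos(p), math.sin(p))


# A normal tilted by a small pitch angle, compared with the flat-ground normal.
error = angular_error_deg(pitched_normal(0.39), (0.0, 1.0, 0.0))
```

Here `error` recovers the pitch offset used to build the tilted normal, which shows how a small pitch oscillation translates directly into normal-vector error.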


G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks

Zhongwei Wan , Yichun Yin , Wei Zhang , Jiaxin Shi , Lifeng Shang , Guangyong Chen , Xin Jiang , Qun Liu

Category: Natural Language Processing

2022-12-07

Recently, domain-specific PLMs have been proposed to boost the task performance of specific domains (e.g., biomedical and computer science) by continuing to pre-train general PLMs with domain-specific corpora. However, this Domain-Adaptive Pre-Training (DAPT; Gururangan et al. (2020)) tends to forget the previous general knowledge acquired by general PLMs, which leads to a catastrophic forgetting phenomenon and sub-optimal performance. To alleviate this problem, we propose a new framework of General Memory Augmented Pre-trained Language Model (G-MAP), which augments the domain-specific PLM by a memory representation built from the frozen general PLM without losing any general knowledge. Specifically, we propose a new memory-augmented layer, and based on it, different augmented strategies are explored to build the memory representation and then adaptively fuse it into the domain-specific PLM. We demonstrate the effectiveness of G-MAP on various domains (biomedical and computer science publications, news, and reviews) and different kinds (text classification, QA, NER) of tasks, and the extensive results show that the proposed G-MAP can achieve SOTA results on all tasks.
